Bayesian Statistics
- Statistics under the Bayesian probability interpretation.
1. Principle of Maximum Entropy
The probability distribution that best represents the current state of knowledge about a system is, among all distributions consistent with that knowledge, the one with the largest entropy.
1.1. Principle of Indifference
In the absence of any evidence, the credence—the degree of belief—should be equally distributed among all possible outcomes.
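As a small numerical illustration of the two principles above, the following sketch compares the Shannon entropy of a uniform distribution over four outcomes with two non-uniform alternatives. The specific distributions and the helper name `shannon_entropy` are assumptions made for this example only.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats; outcomes with zero probability contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical credences over four mutually exclusive outcomes.
uniform = [0.25, 0.25, 0.25, 0.25]  # principle of indifference
skewed = [0.7, 0.1, 0.1, 0.1]       # some outcomes favoured
certain = [1.0, 0.0, 0.0, 0.0]      # all belief on one outcome

for name, dist in [("uniform", uniform), ("skewed", skewed), ("certain", certain)]:
    print(f"{name:8s} H = {shannon_entropy(dist):.4f} nats")
```

With no constraint other than normalization, the uniform distribution attains the largest entropy, \(\ln 4\) here, which is how the principle of indifference appears as a special case of maximum entropy.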
2. Prior Probability
- The probability distribution that expresses belief about a quantity before the evidence is taken into account.
2.1. Strength
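- How much certainty the prior encodes about the system. A strong prior changes little when it is updated with new evidence, whereas a weak prior is readily overridden by the data.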
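One way to see prior strength in action is a Bernoulli success probability with a Beta prior, where the prior parameters act like pseudo-counts. The sketch below is only illustrative: the choice of the Beta family, the data, and the function name `posterior_mean` are assumptions, not part of the notes above.

```python
def posterior_mean(a, b, successes, failures):
    """Posterior mean of a Beta(a, b) prior after Bernoulli observations."""
    return (a + successes) / (a + b + successes + failures)


# Hypothetical data: 8 successes in 10 trials.
successes, failures = 8, 2

# Both priors have mean 0.5, but they carry very different pseudo-counts.
weak = posterior_mean(1, 1, successes, failures)      # Beta(1, 1)
strong = posterior_mean(50, 50, successes, failures)  # Beta(50, 50)

print(f"weak prior   -> posterior mean {weak:.3f}")
print(f"strong prior -> posterior mean {strong:.3f}")
```

The weak prior moves most of the way toward the observed frequency of 0.8, while the strong prior stays close to its prior mean of 0.5.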
3. Bayes' Theorem
- Bayes' Law, Bayes' Rule
3.1. Statement
- \[ \operatorname{P}[A|B] = \frac{\operatorname{P}[B|A]\operatorname{P}[A]}{\operatorname{P}[B]} \] where \(A\) and \(B\) are events, \(\operatorname{P}\) denotes probability, and \(\operatorname{P}[B] \neq 0\).
- According to the Bayesian probability interpretation:
- \(\operatorname{P}[A|B]\) is the posterior probability of \(A\) given \(B\).
- \(\operatorname{P}[B|A]\) is the likelihood of \(A\) given a fixed \(B\), since \(\operatorname{P}[B|A] = \operatorname{L}[A|B]\).
- \(\operatorname{P}[A]\) is the prior probability of \(A\).
- \(\operatorname{P}[B]\) is the marginal probability of \(B\), also called the evidence.
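A worked example may help fix the roles of the four terms. The sketch below assumes a hypothetical diagnostic test: \(A\) is "has the condition" and \(B\) is "tests positive". All numbers and the function name `posterior` are invented for illustration. The marginal \(\operatorname{P}[B]\) is obtained from the law of total probability.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Return P[A|B] from P[A], P[B|A] and P[B|not A]."""
    # Marginal probability P[B] via the law of total probability.
    marginal = likelihood * prior + likelihood_given_not * (1 - prior)
    return likelihood * prior / marginal


# Hypothetical numbers: 1% prevalence, 95% sensitivity, 5% false-positive rate.
p = posterior(prior=0.01, likelihood=0.95, likelihood_given_not=0.05)
print(f"P[A|B] = {p:.4f}")
```

Even with a fairly accurate test, the posterior stays well below one half here because the prior probability is small.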
4. Conjugate Distribution
If the prior distribution and the posterior distribution belong to the same family of probability distributions, then the prior and posterior are called conjugate distributions, and the prior is called a conjugate prior for the likelihood function.
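A standard instance of conjugacy is a Gamma prior on the rate of a Poisson likelihood: the posterior is again a Gamma distribution, so updating reduces to arithmetic on the parameters. The sketch below assumes the shape-rate parameterization; the function name `gamma_poisson_update` and the data are hypothetical.

```python
def gamma_poisson_update(shape, rate, counts):
    """Update a Gamma(shape, rate) prior on a Poisson rate with observed counts.

    The posterior is Gamma(shape + sum(counts), rate + len(counts)).
    """
    return shape + sum(counts), rate + len(counts)


prior_shape, prior_rate = 2.0, 1.0  # hypothetical prior with mean 2.0
counts = [3, 5, 4, 6]               # hypothetical Poisson observations

post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, counts)
print(f"posterior: Gamma({post_shape}, {post_rate}), mean {post_shape / post_rate}")
```

Because the posterior stays in the Gamma family, it can serve directly as the prior for the next batch of observations.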
5. Bayes Estimator
- Bayes Action
An estimator or decision rule that minimizes the posterior expected value of a loss function.
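The definition can be checked numerically on a small discrete posterior. The sketch below searches a grid of candidate actions for the one with the smallest posterior expected loss; under squared-error loss the result should coincide with the posterior mean. The posterior, the candidate grid, and all function names are assumptions made for this example.

```python
def expected_loss(action, values, probs, loss):
    """Posterior expected loss of taking `action`."""
    return sum(p * loss(action, v) for v, p in zip(values, probs))


def bayes_action(values, probs, loss, candidates):
    """Candidate action that minimizes the posterior expected loss."""
    return min(candidates, key=lambda a: expected_loss(a, values, probs, loss))


def squared_error(action, value):
    return (action - value) ** 2


# Hypothetical discrete posterior over a parameter.
values = [0.0, 1.0, 2.0, 3.0]
probs = [0.1, 0.2, 0.3, 0.4]

candidates = [i / 100 for i in range(301)]  # actions 0.00, 0.01, ..., 3.00
best = bayes_action(values, probs, squared_error, candidates)
mean = sum(v * p for v, p in zip(values, probs))

print(f"Bayes action under squared error: {best}")
print(f"posterior mean:                   {mean}")
```

Swapping in a different loss function changes the answer; for example, absolute-error loss leads to a posterior median instead.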
6. Maximum A Posteriori Probability Estimator
- MAP Estimator
6.1. Description
The maximum likelihood estimate of \(\theta\): \[ \hat{\theta}_{\rm MLE}(x) = \operatorname*{arg\ max}_{\theta} f(x\mid\theta) \] can be generalized to include the prior distribution \(g(\theta)\) using Bayes' theorem:
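\begin{align*} \hat{\theta}_{\rm MAP}(x) &= \operatorname*{arg\ max}_{\theta}\frac{f(x\mid \theta)g(\theta)}{\int_\Theta f(x\mid\vartheta)g(\vartheta)\,d\vartheta} \\[10pt] &= \operatorname*{arg\ max}_{\theta} f(x\mid \theta)g(\theta). \end{align*}
The second equality holds because the denominator is the marginal density of \(x\), which is positive and does not depend on \(\theta\), so it does not affect the maximizer.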
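To make the relation between the two estimators concrete, the sketch below assumes a Bernoulli likelihood with a Beta\((a, b)\) prior, for which both estimates have closed forms, and also maximizes the unnormalized product \(f(x\mid\theta)g(\theta)\) on a grid. The model, the data, and the function names are assumptions for illustration only.

```python
def mle_bernoulli(k, n):
    """Maximum likelihood estimate for k successes in n Bernoulli trials."""
    return k / n


def map_bernoulli(k, n, a, b):
    """Mode of the Beta(a + k, b + n - k) posterior.

    Valid when both posterior parameters exceed 1.
    """
    return (a + k - 1) / (a + b + n - 2)


def map_grid(k, n, a, b, steps=10_000):
    """Maximize f(x|theta) * g(theta) on a grid, ignoring normalizing constants."""
    def unnormalized(theta):
        likelihood = theta ** k * (1 - theta) ** (n - k)
        prior = theta ** (a - 1) * (1 - theta) ** (b - 1)
        return likelihood * prior

    grid = [i / steps for i in range(1, steps)]  # open interval (0, 1)
    return max(grid, key=unnormalized)


k, n = 7, 10  # hypothetical data: 7 successes in 10 trials
a, b = 2, 2   # hypothetical Beta(2, 2) prior, centred on 0.5

theta_mle = mle_bernoulli(k, n)
theta_map = map_bernoulli(k, n, a, b)
theta_map_grid = map_grid(k, n, a, b)

print(f"MLE:        {theta_mle:.4f}")
print(f"MAP:        {theta_map:.4f}")
print(f"MAP (grid): {theta_map_grid:.4f}")
```

The grid search never computes the normalizing integral, yet it agrees with the closed-form mode up to the grid resolution. With a flat Beta(1, 1) prior the MAP estimate reduces to the MLE.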